39 research outputs found

    Cold Storage Data Archives: More Than Just a Bunch of Tapes

    Full text link
    The abundance of available sensor and derived data from large scientific experiments, such as earth observation programs, radio astronomy sky surveys, and high-energy physics already exceeds the storage hardware globally fabricated per year. To that end, cold storage data archives are the---often overlooked---spearheads of modern big data analytics in scientific, data-intensive application domains. While high-performance data analytics has received much attention from the research community, the growing number of problems in designing and deploying cold storage archives has only received very little attention. In this paper, we take the first step towards bridging this gap in knowledge by presenting an analysis of four real-world cold storage archives from three different application domains. In doing so, we highlight (i) workload characteristics that differentiate these archives from traditional, performance-sensitive data analytics, (ii) design trade-offs involved in building cold storage systems for these archives, and (iii) deployment trade-offs with respect to migration to the public cloud. Based on our analysis, we discuss several other important research challenges that need to be addressed by the data management community

    DNA CODING FOR IMAGE STORAGE USING IMAGE COMPRESSION TECHNIQUES

    Get PDF
    International audienceLiving in the age of the digital media explosion the urge for finding new efficient methods of data storage increases significantly. Existing storage devices such as hard disks, flash, tape or even optical storage have limited durability in the range of 5 to 20 years. Recent studies have proven that the method of DNA data storage introduces a strong candidate to achieve data longevity. The DNA's biological properties permit the compression of a great amount of information into an extraordinary small volume while also promising efficient data storage for millions of years with no loss of information. This work proposes a new encoding scheme especially designed for the encoding of still images, extending the existing algorithms of DNA data storage by introducing image compression techniques

    Cheap Data Analytics using Cold Storage Devices

    Get PDF
    Enterprise databases use storage tiering to lower capital and operational expenses. In such a setting, data waterfalls from an SSD-based high-performance tier when it is "hot" (frequently accessed) to a disk-based capacity tier and finally to a tape-based archival tier when "cold" (rarely accessed). To address the unprecedented growth in the amount of cold data, hardware vendors introduced new devices named Cold Storage Devices (CSD) explicitly targeted at cold data workloads. With access latencies in tens of seconds and cost/GB as low as $0.01/GB/month, CSD provide a middle ground between the low-latency (ms), high-cost, HDD-based capacity tier, and high-latency (min to h), low-cost, tape-based, archival tier. Driven by the price/performance aspect of CSD, this paper makes a case for using CSD as a replacement for both capacity and archival tiers of enterprise databases. Although CSD offer major cost savings, we show that current database systems can suffer from severe performance drop when CSD are used as a replacement for HDD due to the mismatch between design assumptions made by the query execution engine and actual storage characteristics of the CSD. We then build a CSD-driven query execution framework, called Skipper, that modifies both the database execution engine and CSD scheduling algorithms to be aware of each other. Using results from our implementation of the architecture based on PostgreSQL and OpenStack Swift, we show that Skipper is capable of completely masking the high latency overhead of CSD, thereby opening up CSD for wider adoption as a storage tier for cheap data analytics over cold data

    HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines

    Get PDF
    Modern server hardware is increasingly heterogeneous as hardware accelerators, such as GPUs, are used together with multicore CPUs to meet the computational demands of modern data analytics workloads. Unfortunately, query parallelization techniques used by analytical database engines are designed for homogeneous multicore servers, where query plans are parallelized across CPUs to process data stored in cache coherent shared memory. Thus, these techniques are unable to fully exploit available heterogeneous hardware, where one needs to exploit task-parallelism of CPUs and data-parallelism of GPUs for processing data stored in a deep, non-cache-coherent memory hierarchy with widely varying access latencies and bandwidth. In this paper, we introduce HetExchange–a parallel query execution framework that encapsulates the heterogeneous parallelism of modern multi-CPU–multi-GPU servers and enables the parallelization of (pre-)existing sequential relational operators. In contrast to the interpreted nature of traditional Exchange, HetExchange is designed to be used in conjunction with JIT compiled engines in order to allow a tight integration with the proposed operators and generation of efficient code for heterogeneous hardware. We validate the applicability and efficiency of our design by building a prototype that can operate over both CPUs and GPUs, and enables its operators to be parallelism- and data-location-agnostic. In doing so, we show that efficiently exploiting CPU–GPU parallelism can provide 2.8x and 6.4x improvement in performance compared to state-of-the-art CPU-based and GPU-based DBMS

    Taster: Self-Tuning, Elastic and Online Approximate Query Processing

    Get PDF
    Current Approximate Query Processing (AQP) engines are far from silver-bullet solutions, as they adopt several static design decisions that target specific workloads and deployment scenarios. Offline AQP engines target deployments with large storage budget, and offer substantial performance improvement for predictable workloads, but fail when new query types appear, i.e., due to shifting user interests. To the other extreme, online AQP engines assume that query workloads are unpredictable, and therefore build all samples at query time, without reusing samples (or parts of them) across queries. Clearly, both extremes miss out on different opportunities for optimizing performance and cost. In this paper, we present Taster, a self-tuning, elastic, online AQP engine that synergistically combines the benefits of online and offline AQP. Taster performs online approximation by injecting synopses (samples and sketches) into the query plan, while at the same time it strategically materializes and reuses synopses across queries, and continuously adapts them to changes in the workload and to the available storage resources. Our experimental evaluation shows that Taster adapts to shifting workload and to varying storage budgets, and always matches or significantly outperforms the state-of-the-art performing AQP approaches (online or offline)

    Hardware-conscious Hash-Joins on GPUs

    Get PDF
    Traditionally, analytical database engines have used task parallelism provided by modern multisocket multicore CPUs for scaling query execution. Over the past few years, GPUs have started gaining traction as accelerators for processing analytical queries due to their massively data-parallel nature and high memory bandwidth. Recent work on designing join algorithms for CPUs has shown that carefully tuned join implementations that exploit underlying hardware can outperform naive, hardware-oblivious counterparts and provide excellent performance on modern multicore servers. However, there has been no such systematic analysis of hardware-conscious join algorithms for GPUs that systematically explores the dimensions of partitioning (partitioned versus non-partitioned joins), data location (data fitting and not fitting in GPU device memory), and access pattern (skewed versus uniform). In this paper, we present the design and implementation of a family of novel, partitioning-based GPU-join algorithms that are tuned to exploit various GPU hardware characteristics for working around the two main limitations of GPUs–limited memory capacity and slow PCIe interface. Using a thorough evaluation, we show that: i) hardware-consciousness plays a key role in GPU joins similar to CPU joins and our join algorithms can process 1 Billion tuples/second even if no data is GPU resident, ii) radix partitioning-based GPU joins that are tuned to exploit GPU hardware can substantially outperform non-partitioned hash joins, iii) hardware-conscious GPU joins can effectively overcome GPU limitations and match, or even outperform, state-of-the-art CPU joins

    The Case For Heterogeneous HTAP

    Get PDF
    ABSTRACT Modern database engines balance the demanding requirements of mixed, hybrid transactional and analytical processing (HTAP) workloads by relying on i) global shared memory, ii) system-wide cache coherence, and iii) massive parallelism. Thus, database engines are typically deployed on multi-socket multi-cores, which have been the only platform to support all three aspects. Two recent trends, however, indicate that these hardware assumptions will be invalidated in the near future. First, hardware vendors have started exploring alternate non-cache-coherent shared-memory multi-core designs due to escalating complexity in maintaining coherence across hundreds of cores. Second, as GPGPUs overcome programmability, performance, and interfacing limitations, they are being increasingly adopted by emerging servers to expose heterogeneous parallelism. It is thus necessary to revisit database engine design because current engines can neither deal with the lack of cache coherence nor exploit heterogeneous parallelism. In this paper, we make the case for Heterogeneous-HTAP (H 2 TAP), a new architecture explicitly targeted at emerging hardware. H 2 TAP engines store data in shared memory to maximize data freshness, pair workloads with ideal processor types to exploit heterogeneity, and use message passing with explicit processor cache management to circumvent the lack of cache coherence. Using Caldera, a prototype H 2 TAP engine, we show that the H 2 TAP architecture can be realized in practice and can offer performance competitive with specialized OLTP and OLAP engines
    corecore